Office Action Summary

Examiner: Martin Lerner

Applicant(s): KLAKOW ET AL.

Art Unit: 2654


The MAILING DATE of this communication appears on the cover sheet with the correspondence address 
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed 
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133). 

- Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b). 

Status 

1) [ ] Responsive to communication(s) filed on .

2a) [ ] This action is FINAL. 2b) [X] This action is non-final.

3) [ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.
Disposition of Claims 

4) [X] Claim(s) 1 to 10 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration.

5) [ ] Claim(s) is/are allowed.

6) [X] Claim(s) 1 to 9 is/are rejected.

7) [X] Claim(s) 10 is/are objected to.

8) [ ] Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) [ ] The specification is objected to by the Examiner.

10) [ ] The drawing(s) filed on is/are: a) [ ] accepted or b) [ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).

11) [ ] The proposed drawing correction filed on is: a) [ ] approved b) [ ] disapproved by the Examiner.

If approved, corrected drawings are required in reply to this Office action.

12) [ ] The oath or declaration is objected to by the Examiner.
Priority under 35 U.S.C. §§119 and 120 

13) [X] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).

a) [X] All b) [ ] Some * c) [ ] None of:

1. Certified copies of the priority documents have been received.

2. [ ] Certified copies of the priority documents have been received in Application No. .

3. [ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).

* See the attached detailed Office action for a list of the certified copies not received.

14) [ ] Acknowledgment is made of a claim for domestic priority under 35 U.S.C. § 119(e) (to a provisional application).

a) [ ] The translation of the foreign language provisional application has been received.

15) [ ] Acknowledgment is made of a claim for domestic priority under 35 U.S.C. §§ 120 and/or 121.
Attachment(s) 

1) [X] Notice of References Cited (PTO-892) 4) [ ] Interview Summary (PTO-413) Paper No(s). .

2) [ ] Notice of Draftsperson's Patent Drawing Review (PTO-948) 5) [ ] Notice of Informal Patent Application (PTO-152)

3) [X] Information Disclosure Statement(s) (PTO-1449) Paper No(s) 4 . 6) [ ] Other:

U.S. Patent and Trademark Office

PTOL-326 (Rev. 04-01) Office Action Summary Part of Paper No. 5
DETAILED ACTION

Specification

1. The disclosure is objected to because of the following informalities:

On page 2, lines 3 to 4, the reference to claims 2 to 7 should be deleted. The
final numbering of the claims may not reflect the subject matter cited with respect to
these claims, so reference to any claims should be avoided in the Specification.

The arrangement of the Specification does not include headings as is
conventional in patent practice in the United States.

Appropriate correction is required.



Claim Objections

2. Claim 6 is objected to because of the following informalities:

The claim would be clearer if the abbreviation "OOV" is spelled out in the claim,
as "out of vocabulary".

Appropriate correction is required.

3. Claim 10 is objected to under 37 CFR 1.75(c) as being in improper form because
a multiple dependent claim cannot depend upon another multiple dependent claim. See
MPEP § 608.01(n). Accordingly, claim 10 has not been further treated on the merits.

Claim Rejections - 35 USC § 112



4. The following is a quotation of the second paragraph of 35 U.S.C. 112: 

The specification shall conclude with one or more claims particularly pointing out and distinctly 
claiming the subject matter which the applicant regards as his invention. 

5. Claim 6 is rejected under 35 U.S.C. 112, second paragraph, as being indefinite 
for failing to particularly point out and distinctly claim the subject matter which applicant 
regards as the invention. 

Regarding claim 6, the phrase "especially when" renders the claim indefinite 
because it is unclear whether the limitation following the phrase is part of the claimed 
invention. See MPEP § 2173.05(d). 



Claim Rejections - 35 USC § 102

6. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless -

(a) the invention was known or used by others in this country, or patented or described in a printed
publication in this or a foreign country, before the invention thereof by the applicant for a patent.

(e) the invention was described in a patent granted on an application for patent by another filed in the
United States before the invention thereof by the applicant for patent, or on an international application
by another who has fulfilled the requirements of paragraphs (1), (2), and (4) of section 371(c) of this
title before the invention thereof by the applicant for patent.

7. Claims 1 to 6 and 8 are rejected under 35 U.S.C. 102(a) as being clearly
anticipated by Klakow ("Selecting articles from the language model training corpus").

8. Applicants cannot rely upon the foreign priority papers to overcome this rejection
because a translation of said papers has not been made of record in accordance with
37 CFR 1.55. See MPEP § 201.15.

9. Claims 7 and 9 are rejected under 35 U.S.C. 102(e) as being anticipated by 
Ramaswamy et al. 

Regarding independent claim 7, Ramaswamy et al. discloses a method of 
building language models for speech recognition, characterized in that: 

"a text corpus part of a given first text corpus is gradually extended by one or 
various other text corpus parts of the first text corpus in dependence on text data of an 
application-specific text corpus to form a second text corpus and in that the values of 
the language model are generated while the second text corpus is used" - language 
model constructor 50 reads linguistic units from seed corpus 10 and constructs an initial 
reference language model 80 from these linguistic units; once an initial reference 
language model 80 ("a first text corpus") is constructed, iterative corpus extractor 60 
reads linguistic units ("one or various text corpus parts") from external corpus 20 and 
computes a relevance score for each linguistic unit in accordance with language model 
80; an iterative language model building technique generates a final language model 90 
("a second text corpus") from a small, domain-restricted seed corpus 15 ("in dependence 
on text data of an application-specific text corpus") and a large, less restricted external 
corpus 20; the linguistic units in seed corpus 15 are all highly relevant to a common 
domain or field ("an application-specific text corpus"), and external corpus 20 contains 
text data that is less relevant to the domain of interest than the data within the seed 
corpus; final language model 90 is used in language processing applications (column 2, 
line 40 to column 3, line 63: Figures 1 and 2). 
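
For illustration only, the iterative extension described in the paragraph above can be sketched in Python. The sketch is not taken from Ramaswamy et al.: the function names, the add-one-smoothed unigram model, the use of perplexity as the relevance score, and the fixed threshold are all assumptions chosen to keep the example small.

```python
import math
from collections import Counter


def unigram_logprob(sentences):
    """Return a function giving add-one-smoothed unigram log-probabilities.

    `sentences` is a list of token lists. The model is a stand-in for the
    reference language model; a unigram model is an assumption.
    """
    counts = Counter(tok for s in sentences for tok in s)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words
    return lambda tok: math.log((counts.get(tok, 0) + 1) / (total + vocab))


def perplexity(logprob, sentence):
    """Per-token perplexity of one non-empty tokenized sentence."""
    return math.exp(-sum(logprob(t) for t in sentence) / len(sentence))


def extend_corpus(seed, external, threshold, max_rounds=10):
    """Grow the seed corpus with external units that score as relevant.

    Each round rebuilds the model from everything selected so far, scores
    the remaining external units, and adds those whose perplexity does not
    exceed `threshold`. Stops when a round adds nothing.
    """
    selected = list(seed)
    remaining = list(external)
    for _ in range(max_rounds):
        logprob = unigram_logprob(selected)
        relevant = [s for s in remaining if perplexity(logprob, s) <= threshold]
        if not relevant:
            break
        selected.extend(relevant)
        remaining = [s for s in remaining if s not in relevant]
    return selected
```

With a two-sentence travel-domain seed, a call such as `extend_corpus(seed, external, threshold=8.0)` would admit an external sentence built from seed vocabulary and leave out one with no vocabulary overlap.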
Regarding independent claim 9, Ramaswamy et al. discloses a method of 
building language models for speech recognition, characterized in that: 

"a part of a given acoustic training material, which represents a multitude of 
speech utterances, is gradually extended by one or more parts of the given acoustic 
training material and in that the acoustic references of the acoustic model are formed by 
means of the accumulated parts of the given acoustic training material" - language 
model constructor 50 reads linguistic units ("one or more parts of the given acoustic 
training material") from seed corpus 10 and constructs an initial reference language 
model 80 from these linguistic units; once an initial reference language model 80 is 
constructed, iterative corpus extractor 60 reads linguistic units from external corpus 20 
and computes a relevance score for each linguistic unit in accordance with language 
model 80, and incrementally increases the size of the initial reference language model 
80 ("is gradually extended by one or more parts of the given acoustic training material"); 
an iterative language model building technique generates a final language model 90 
("the acoustic model") from a small, domain-restricted seed corpus 15 and a large, less 
restricted external corpus 20; the linguistic units in seed corpus 15 are all highly relevant 
to a common domain or field, and external corpus 20 contains text data that is less 
relevant to the domain of interest than the data within the seed corpus; final language 
model 90 is used in language processing applications (column 2, line 40 to column 4, 
line 7: Figures 1 and 2); implicitly, linguistic units are acoustic units in speech 
recognition. 
Claim Rejections - 35 USC § 103 



10. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

11. Claims 1, 2, 5/1, 5/2, 6/5/1, 6/5/2, and 8 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Ramaswamy et al. in view of Bandara et al. 

Regarding independent claim 1, Ramaswamy et al. discloses a method of 
generating a language model for speech recognition, characterized: 

"in that a first text corpus is gradually [reduced] by one or more various text 
corpus parts in dependence on text data of an application-specific second text corpus" - 
language model constructor 50 reads linguistic units from seed corpus 10 and 
constructs an initial reference language model 80 from these linguistic units; once an 
initial reference language model 80 ("a first text corpus") is constructed, iterative corpus 
extractor 60 reads linguistic units ("one or various text corpus parts") from external 
corpus 20 and computes a relevance score for each linguistic unit in accordance with 
language model 80; an iterative language model building technique generates a final 
language model 90 from a small, domain-restricted seed corpus 15 ("in dependence on 
text data of an application-specific second text corpus") and a large, less restricted 
external corpus 20; the linguistic units in seed corpus 15 ("an application-specific 
second text corpus") are all highly relevant to a common domain or field, and external 
corpus 20 contains text data that is less relevant to the domain of interest than the data 
within the seed corpus (column 2, line 40 to column 3, line 63: Figures 1 and 2); 

"in that the values of the language model are generated on the basis of the 
[reduced] first text corpus" - final language model 90 is used in language processing 
applications (column 2, line 40 to column 3, line 63: Figures 1 and 2). 

Regarding independent claim 1, Ramaswamy et al. discloses a method of 
building language models by iteratively increasing the size of a language model by 
adding units from a large external text corpus, where the added units are similar to 
linguistic units in a seed corpus. Thus, Ramaswamy et al. discloses gradually 
increasing the size of the language model but omits gradually reducing the size of the 
language model. Still, one of ordinary skill in the art would recognize that the language 
model building method of Ramaswamy et al. may be reversed in order gradually to 
reduce the size of the language model instead of gradually increasing its size. That is, 
the large external text corpus 20 may be gradually reduced when linguistic units 
iteratively are compared to, and found to be different from, those in the seed corpus. 
Bandara et al. teaches a method for adapting the size of a language model in a speech 
recognition system, where an acoustic distance is calculated, and the contents of the 
language model are reduced with respect to acoustic distance. (Column 5, Lines 20 to 
63: Figure 2) The stated advantage is that the size of the language model is reduced, while 
retaining accuracy. (Column 3, Line 56 to Column 4, Line 24) It would have been 
obvious to one having ordinary skill in the art to reverse the language model building 
process of Ramaswamy et al. as suggested by Bandara et al. for the purpose of 
reducing the size of the language model, while retaining recognition accuracy. 
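
For illustration only, the reversed procedure contemplated above (gradually reducing a large corpus by dropping the units least similar to the seed corpus) can be sketched in Python. The sketch appears in neither reference: the function names, the add-one-smoothed unigram model, and the perplexity threshold are assumptions, and it does not model the acoustic distance measure attributed to Bandara et al.

```python
import math
from collections import Counter


def unigram_logprob(sentences):
    """Return a function giving add-one-smoothed unigram log-probabilities."""
    counts = Counter(tok for s in sentences for tok in s)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen words
    return lambda tok: math.log((counts.get(tok, 0) + 1) / (total + vocab))


def perplexity(logprob, sentence):
    """Per-token perplexity of one non-empty tokenized sentence."""
    return math.exp(-sum(logprob(t) for t in sentence) / len(sentence))


def reduce_corpus(large, seed, threshold):
    """Shrink `large` one unit at a time, guided by a model of `seed`.

    On each pass the unit with the highest perplexity under the seed model
    is removed, until every remaining unit is at or below `threshold`.
    """
    logprob = unigram_logprob(seed)
    kept = list(large)
    while kept:
        worst = max(kept, key=lambda s: perplexity(logprob, s))
        if perplexity(logprob, worst) <= threshold:
            break
        kept.remove(worst)
    return kept
```

The seed model is held fixed here for simplicity; a variant could rebuild the model after each removal, which is a design choice this sketch does not explore.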

Regarding claim 2, Bandara et al. discloses calculating the language model 
parameters based upon trigram, bigram, and unigram probabilities (column 2, lines 20 
to 67). 

Regarding claim 5/1 and 5/2, Ramaswamy et al. discloses a test corpus ("test 
text") is used by model checker 70 to evaluate the language model quality, calling for 
further language building iterations, if necessary, until its quality is satisfactory (column 
3, lines 6 to 14; column 3, line 64 to column 4, line 7). 

Regarding claim 6/5/1 and 6/5/2, Ramaswamy et al. discloses iterative corpus 
extractor computes a relevance score based upon a perplexity measure relative to a 
threshold to determine how many linguistic units to add to the language model (column 
4, lines 7 to 54). 

Regarding independent claim 8, Ramaswamy et al. discloses a method of 
generating a language model for speech recognition, characterized: 

"in that acoustic training material representing a first number of speech 
utterances is gradually [reduced] by training material parts representing individual 
speech utterances in dependence on a second number of application-specific speech 
utterances" - language model constructor 50 reads linguistic units ("training material 
representing a number of speech utterances") from seed corpus 10 and constructs an 
initial reference language model 80 from these linguistic units; once an initial reference 
language model 80 ("a first number of speech utterances") is constructed, iterative 
corpus extractor 60 reads linguistic units from external corpus 20 and computes a 
relevance score for each linguistic unit in accordance with language model 80, and 
incrementally increases the size of the initial reference language model 80; an iterative 
language model building technique generates a final language model 90 from a small, 
domain-restricted seed corpus 15 ("in dependence on a second number of application- 
specific speech utterances") and a large, less restricted external corpus 20; the 
linguistic units in seed corpus 15 are all highly relevant to a common domain or field, 
and external corpus 20 contains text data that is less relevant to the domain of interest 
than the data within the seed corpus (column 2, line 40 to column 4, line 7: Figures 1 
and 2); implicitly, linguistic units are acoustic units in speech recognition; 

"in that the acoustic references of the acoustic model are formed by means of the 
[reduced] acoustic training material" - final language model 90 is used in language 
processing applications (column 2, line 40 to column 3, line 63: Figures 1 and 2). 

Regarding independent claim 8, Ramaswamy et al. discloses a method of 
building language models by iteratively increasing the size of a language model by 
adding units from a large external text corpus, where the added units are similar to 
linguistic units in a seed corpus. Thus, Ramaswamy et al. discloses gradually 
increasing the size of the language model but omits gradually reducing the size of the 
language model. Still, one of ordinary skill in the art would recognize that the language 
model building method of Ramaswamy et al. may be reversed in order gradually to 
reduce the size of the language model instead of gradually increasing its size. That is, 
the large external text corpus 20 may be gradually reduced when linguistic units 
iteratively are compared to, and found to be different from, those in the seed corpus. 
Bandara et al. teaches a method for adapting the size of a language model in a speech 
recognition system, where an acoustic distance is calculated, and the contents of the 
language model are reduced with respect to acoustic distance. (Column 5, Lines 20 to 
63: Figure 2) The stated advantage is that the size of the language model is reduced, while 
retaining accuracy. (Column 3, Line 56 to Column 4, Line 24) It would have been 
obvious to one having ordinary skill in the art to reverse the language model building 
process of Ramaswamy et al. as suggested by Bandara et al. for the purpose of 
reducing the size of the language model, while retaining recognition accuracy. 

12. Claims 3, 4, 5/3, 5/4, 6/5/3, and 6/5/4 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Ramaswamy et al. in view of Bandara et al. as applied to 
claims 1 and 2 above, and further in view of Klakow ("Language-model optimization by 
mapping of corpora"). 

Concerning claim 3, Ramaswamy et al. discloses calculating a relevance score, 
but omits a selection criterion based on the equation. However, Klakow ("Language-model 
optimization by mapping of corpora") discloses mapping of training corpora by an 
n-gram perplexity criterion involving the equation. (Page 702, Left Column) This is stated 
to have the advantage of reduced perplexity for speech recognition applications. (Page 
701) It would have been obvious to one having ordinary skill in the art to apply the 
equation taught by Klakow ("Language-model optimization by mapping of corpora") as 
the relevance score of Ramaswamy et al. for the purpose of reducing perplexity in 
speech recognition applications. 

Concerning claim 4, Bandara et al. discloses calculating the language model 
parameters based upon trigram, bigram, and unigram probabilities (column 2, lines 20 
to 67). 

Concerning claim 5/3 and 5/4, Ramaswamy et al. discloses a test corpus ("test 
text") is used by model checker 70 to evaluate the language model quality, calling for 
further language building iterations, if necessary, until its quality is satisfactory (column 
3, lines 6 to 14; column 3, line 64 to column 4, line 7). 

Concerning claim 6/5/3 and 6/5/4, Ramaswamy et al. discloses iterative corpus 
extractor computes a relevance score based upon a perplexity measure relative to a 
threshold to determine how many linguistic units to add to the language model (column 
4, lines 7 to 54). 



Conclusion

13. The prior art made of record and not relied upon is considered pertinent to 
Applicants' disclosure. 

Bellegarda discloses related art. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lerner whose telephone number is (703) 308-9064. 
The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to 
Thursday. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on (703) 305-9645. The fax phone 
number for the organization where this application or proceeding is assigned is 
(703) 872-9306. 

Any inquiry of a general nature or relating to the status of this application or 
proceeding should be directed to the receptionist whose telephone number is 
(703) 305-4700. 